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Abstract 

The research on biomarkers has been limited in its effectiveness because biomarker levels can 
only be measured within the thresholds of assays and laboratory instruments, a challenge referred 
to as a detection Umit (DL) problem. In this paper, we propose a Bayesian approach to the Cox 
proportional hazards model with explanatory variables subject to lower, upper, or interval DLs. 
We demonstrate that by formulating the time-to-event outcome using the Poisson density with 
counting process notation, implementing the proposed approach in the OpenBUGS and JAGS is 
straightforward. We have conducted extensive simulations to compare the proposed Bayesian 
approach to the other four commonly used methods and to evaluate its robustness with respect to 
the distribution assumption of the biomarkers. The proposed Bayesian approach and other 
methods were applied to an acute lung injury study, in which a panel of cytokine biomarkers was 
studied for the biomarkers' association with ventilation-free survival. 
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INTRODUCTION 

Biomarkers have been increasingly and widely used in clinical practice in recent years for 
disease diagnosis and prognosis based on their underlying pathological and physiologic 
mechanisms [1]. In these applications, biomarkers are used either to identify a subgroup of a 
study population or predict a disease outcome [2]. While some biomarkers are involved in 
the early development of a condition and thus might provide diagnoses [3, 4], others are 
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associated more with disease outcome and are considered prognostic of patient sm^vival [5]. 
For example, a nationwide cohort study revealed that a traditional serum biomarker 
leukotriene pathway agent was associated with the breast cancer; p53 expression in primary 
tumors was an independent prognostic factor that influenced relapse-free survival in stage II 
patients, and lack of Bcl-2 expression was independently associated with a poor prognosis 
among stage III patients [6]. Another large cohort study identified protein biomarker 
CTNNBl to be associated with improved survival in colorectal cancer [3]. Discrepancies 
between studies on the same biomarkers have been observed [7], however, and a variety of 
problems have been cited as the cause of these discrepancies including inappropriate 
statistical analyses [8]. As a result, in 2005, REporting recommendations for tumor 
MARKer prognostic studies (RE-MARK) was published [7]. One of the goals of these 
guidelines was to improve the usefulness of the results from clinical prognostic studies and 
enhance the comparability between studies. 

Detection limit (DL) is a measurement error problem with bounded error [9, 10]. In 
particular, DL is a measurement problem in which the actual biomarker values are 
immeasurable either below the lower detection limit (LDL) or above the upper detection 
limit (UDL) of laboratory instruments. These values are often called non-detects. The 
applicability of biomarkers in clinical practice has been compromised in the presence of DL 
as inappropriate use of statistical methods when dealing with DL may lead to biased 
conclusions and inconsistent results [11]. Simple methods, such as deletion and single 
replacement with one-half of the LDL, were often used to fill in the immeasurable values. 
Some sophisticated methods have been proposed to replace the missing values based on the 
distribution of the observed values. Regression on Order Statistic (ROS) is one such 
approach; however, its usage is severely limited because its goal is to estimate summary 
statistics [12]. More recently, vast researches have been conducted in DL and its related 
areas. For example, an MLE-based approach has been proposed to handle both binary [13] 
and continuous outcomes [14] with an independent variable subject to DL; a semiparametric 
Bayesian method was proposed under proportional hazards model for interval censored data 
with frailty effects [15]; a semiparametric imputation approach has been developed for 
covariates subject to DL, in which the conditional quantiles of the censored covariates are 
assumed to be linear in the observed variables; [16, 17] have proposed a Bayesian approach 
to logistic regression parameter estimation with exposure variables subject to DL; [18] 
discussed the full Bayesian estimation of joint models when time dependent covariates or 
outcomes are submitted to lower detection levels. These works have improved the 
usefulness of biomarkers in the development of diagnostic and prognostic models. However, 
in general there remain a few problems to be solved. To our knowledge, many of the 
medical researches involving Cox proportional hazards models with DL biomarkers still 
relied on simple naive methods like deletion or single replacement to obtain the estimates of 
hazard ratio (HR), mainly due to the computational burden of advanced methods and lacking 
of user-accessible programs. Given promising prognostic value and increased clinical 
application of biomarkers in disease outcome prediction, a survival model that is capable of 
handling DL is desired. In this paper, we formulated the time-to-event outcome using the 
Poisson density with counting process notation, and provided straightforward JAGS/ 
OpenBUGS programs to implement the method. In the current clinical practice, a panel of 
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biomarkers, rather than a single biomarker, is often considered for disease diagnosis/ 
prognosis [19], and therefore, we aim to develop a method to handle simultaneous DL issues 
on multiple biomarkers. 

The paper is organized as follows. In the Motivating Study Section, we provide a detailed 
description of the motivating study which illustrates the association between biomarkers and 
the survival function for patients with acute lung injury (ALI). In the Methodological 
Development Section, we describe the proposed Bayesian method for the cases in which 
single or multiple explanatory variables are subject to lower, upper, and interval DLs. In the 
Simulation Studies Section, we present an extensive simulation study to examine the 
performance of the proposed method, comparing it with four existing methods. We revisit 
the motivating study in the section of Analysis of the Acute Lung Injury Study, followed by 
the Discussion Section. 

MOTIVATING STUDY 

With the goal of developing a prognostic model for Acute Lung Injury/Acute Respiratory 
Distress Syndrome (ALI/ARDS), the researcher measured 8 cytokine biomarkers that reflect 
the complex pathogenesis of ALI/ARDS in baseline plasma from 549 patients [20]. The 
patients were enrolled in the National Heart, Lung, and Blood Institute (NHLBI) ARDS 
Clinical Trials Network clinical trial of two different levels of positive end-expiratory 
pressure [20]. The collected biomarkers included markers of inflammation (IL6, IL8, and 
TNFRl), lung and systemic endothelial activation and injury (VWF), lung epithelial injury 
(SP-D), adhesion molecules (ICAM.l), and activation of coagulation and inhibition of 
fibrinolysis (protein C, and PAI-1). Among these biomarkers, IL8 had 35% of values that 
were below the DL threshold of 2.5 pg/ml of the enzyme-linked immunosorbent assay. 
Collected clinical data include age, cause of ALI/ARDS, severity of illness scoring 
(APACHE III), ventilator parameters, hemodynamic data, and alveolar-arterial O2 
difference (A-a O2). Cox proportional hazards models were used to investigate the 
association between the biomarker levels and time to ventilation removal (VR). 

METHODOLOGICAL DEVELOPMENT 

Let Tj and Q represent the failure and censoring times, respectively, for the rth patient, 
where ; = 1,- ■ Let Xi be the transformed biomarker covariates subject to detection limit 
with a known transformation function g{-) and Z,- be the additional covariates. Observed 
times to event are denoted by Yj = Tj A C,-, and the observed event indicators are denoted by 
A,- = I(Tf < Cj) where a A b = min(a, b) and 1(A) is an indicator function taking the value 1 
when condition A holds and 0 otherwise. 

In this paper, we consider the Cox proportional hazards model [21] given by 

Xiit\Zi, Xi)=Xo{t)exp{fXi+-f'^Zi)dt (1) 

where X{t\Z,X) is the conditional hazard function of time-to-event given the covariates, Xo(t) 
is an unknown baseline function, P is a qxl vector of regression coefficients corresponding 
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to transformed biomarker covariates X,-, and 7 is a pxl vector of regression coefficients for 
other additional covariates Z,-. 

Single Covariate Subject to DL 

We first consider the case of one covariate subject to DL (q=l) and then generahze this 
method to the scenario with muhiple covariates subject to DL (q>l) in the next section. 
With q=L we assume the lower DL for the transformed biomarker measurement x is Id. The 
observed data are 

Dobs^ili^'^i, ^i^{^M<i),ri^I{^i < ld),Zi,i^l,2, - ■ ■ ,n), 

where aV b = max(a, b). The likelihood function of the observed data is 

L=]X^^^[Lyi{yi\xi,Zi,,d,'j)f{x*\z„(fi)]^^~'''^ ^fg^^^Lyi{y,\x,Zi,/3,-f)f(x\zi,(f)dx] ' (2) 

where Lyj(-) is the likelihood of (y,-, 4) and (f) is a vector of regression coefficients in the 
biomarker model f[x*\zi, ip), both will be discussed later. Note that the lower limit of the 
integration in (2) is g(0) because biomarker measurements have support (0,oo), and x is the 
transformed biomarker measurement with transformation function g(-). 

Extending the proposed method to interval DL (both lower and upper DLs) is 
straightforward. We assume the upper DL for x is ud and denote the observations as 

Dobs=iyi,Si,x*={x,Wld/\ud),rj,i=I{xi < ld),riu=I{x, > ud), Zi,i=l,2, ■ ■ ■ ,n) 
The hkelihood function of the observed data for the ith subject now becomes 

n 

i=l 

[l''gio)Lyi{yi\x,2i,f3,-f)f{x\zi,(p)dx'^ ''^ 

Multiple Covariates Subject to DL 

When there are two or more covariates subject to DL, f(xlz) in (2) becomes a multivariate 
density function. A simple extension may be to model f(xlz) through a multivariate Gaussian 
distribution; however, the normality assumption may not hold for all biomarkers subject to 
DL. Therefore, following [22, 23], we model the joint distribution of the biomarkers using a 
series of one dimensional conditional densities as 

f{x\z,ip) = f(Xil\z^,Xi2,--- ,Xig,ipi)f{Xi2\zi,Xi3,--- ,Xi3,--- ,Xiq,(p2) ■ ■ ■ f(Xig\zi,ipq), (4) 

where (pj^ is the vector of unknown parameters in the distribution of f{xii^\zp ' "^^ly, (pjj, 
and cf) = ((pi, ■ ■ ■, (pq). The advantage of (4) is that it allows for a more flexible model 
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specification for the joint distribution of/(xlz), and it is especially useful when the 
biomarkers follow different distribution functions. Although the joint model depends on the 
order of conditioning, [22, 23], among others, have shown in the missing data literature that 
estimates are robust with respect to the order of conditioning; however, by no means should 
one discount the importance of carrying out sensitivity analysis for the order of 
conditioning. 

When there are q biomarkers subject to lower DL, the observed data are 

Dobs={yi,Si,X*i = {Xtl'Vldl),r,l=I{x,l <Wl),-- - ,X*^ = {Xiq\Jldq),r,q=I{Xiq < ld),Zi,i = l,2, - ■ ■ 

where Ztfj, is the lower DL of x^. Let Xf _j = (x,i, ■ ■ ■, x,y_i, x, j+j, ■ • •, x,-^) denote all the 
biomarkers from the ;th subject except the jth biomarker. The likelihood function of the !th 
observed value is 

n*=i[/lto)-^^'^2/'|2;i,-j,a;,^i,/3,7)/(a;ij_i,a;|2i,(^)da;]^ ''^ " ^ (5) 
■ ■ ■ [4to) ■ ■ ■ fg%)^yiiyi\^^ ^'"'3' l)fi.x\zi, if)dXi ■ ■ ■ dXg] " 

where /(x,lz,', (f))is defined in (4) 

Counting Process Likelihood for Posterior Computation and Inference 

For purposes of computation and ease of extending the proposed method, we will formulate 
the likelihood of (y,-, 4) using the counting process in this section. In particular, for subject /, 
we observe counting process Nj{t), which counts the number of failures occurred up to time 
t. Following [24], the Cox proportional hazards model in (1) can be formulated using the 
counting process notation introduced by [25] and is given by 

E{dNi{t)\^t-,Z,X}=\o{t)exp{f3'^Xi+j'^Zi)dt (6) 

where ,'Wi- represents all the past history just before time t, and dNi(t) is the increment of 
Ni(t) over the small time interval [t, t+dt]; dNi(t) = 1 if subject i is observed to fail during the 
time interval [t, t+dt], and dNj{t) = 0 otherwise. The observed data from n subjects can be 
rewritten using the counting process given by {Ni{t)I{t < Yi),Yi,Ri,Zi,Ci}, i = l,2,---,n where 

■''ii) " " " ) •^iq) ^nd Ri — (rij,-- ■r(i^). Under the standard non-informative censoring 
assumption, the counting process likelihood of {Nj{f)I(t < given (X, Zi,Ci) is 

^y^= {ll.^y A,(i)'^^^(*^ } exp {-/o^'A,(i)dt}, (7) 

where /I,- (f) = (t)exp(0'^Xi + /Zi)dt. 

Following [24], we can express the counting process likelihood in (7) using Poisson density 

as 
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Lyi=Y[.^^Poisson{dNy- VydAgjexpifi'^Xi+f'^ Zi)) (8) 

where the notation Poisson(x; \i) is the density of a random variable x, which follows a 
Poisson distribution with mean p; t[,-- -Jj are the unique failure times observed in the data, 
Vjj = 1 if subject i is at risk at time f,-. and Vg = 0 otherwise, dNij = 1 if individual i fails at 
time tf and dNjj = 0 otherwise, and dAgj is an increment on the cumulative baseline hazard 

function Ao{t)= ^\^\o{ii)dii. 

The joint posterior density of (fi y,t/Ao(- ),(<!') based on the observed data can be written as 

7r(/3,7,dAo(-),<^5|£'ois) L x 7r(/3, 7, dAo(-), ly^), (9) 

nn 
i=i"^'' is defined in (3) or (5) for the univariate or multivariate biomarkers 

subject DL, respectively, Ly,-. is defined in (8), and 7r(P, y , dAO(-), <!>) is the joint prior for (P, 

Y , dAO(-), 4>)- In practice, we can specify the independent prior and let 7t(P, 7, dAO(-), ^) = 

7r(P)7T(P)7T(dA0(-))Tr('|>)- We can specify the non-informative prior for p, y, and cf) to minimize 

the influence of the prior on the posterior distribution. For the prior of dA( ), we consider the 

conjugate independent increments prior suggested by [26], namely 

dAo{t)r^Gamma{c x dA*o{t), c), where dAQ(t)is a prior guess for Ao(f) and c controls the 
prior precision with small values of c corresponding to weak prior beliefs. 

Selection of Biomarker Density Functions 

Since the distribution of cytokine biomarker measurements is often positively skewed and 
fits a log-normal distribution [27, 28], a logarithmic transformation of g(-) is typically used 
with normality assumption for f(x\zi,<p). A QQ-plot can be used to evaluate this assumption. 
When the normality assumption is violated, other parametric models, such as GLMs, can be 
specified for the biomarkers subject to the DL and implemented easily in WinBUGS or 
JAGS. On the other hand, the selection of biomarker density functions can be viewed as a 
special case of model comparison and be investigated via Bayes factors, model diagnostics, 
and goodness of fit measurements [29]. 

Simulation Studies 

We conducted extensive Monte Carlo simulations to evaluate the performance of the 
proposed Bayesian. We considered 18 scenarios, which included 3 distributions (normal, t25, 
and gamma) for biomarker measurements, 3 percentages of measurements below the DL 
(10%, 30%, 50%), and 2 coefficients (P=0.8 for ti'ue association, and P=0 for null). In 
addition to a biomarker, we also simulated 2 other continuous covariates according to the 
motivating study. We simulated both event time and censoring time with a Weibull 
distribution which used pre-specified baseline hazards for event and censoring and pre- 
specified coefficients for the biomarker and covariates. The censoring proportions of the 
survival time were set to 15% throughout the simulations. For each study scenario, we 
considered 200 subjects and 1,000 simulations. When analyzing power change for the 9 
scenarios of positive association, we used P=0.225 so that the full data analysis had 80% 
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power to reject the null hypothesis at which the biomarker values were simulated from the 
log-normal distribution. 

In this study, we compared the following 5 methods for their performance on the estimates 
of regression coefficients: (1) single replacement of non-detects with one-half of the LDL 
threshold; (2) case-wise deletion of non-detects; (3) regular multiple imputation (MI) 
method; (4) extrapolation by the ROS; and (5) the proposed Bayesian approach. Single 

replacement with one-half DL or ^ is the most popular method. Deletion is simply 
excluding (case-wise) all the observations with values below the threshold in the analysis. 
MI refers to the regular multiple imputation method assuming missing at random [30]. We 
modified the MI procedure proposed by [31] so that the new procedure takes all aspects of 
uncertainty in the imputations into account [32-35]. ROS is a method which is often used in 
environmental science to compute summary statistics [12]. This method fits a regression line 
on a normal probability plot of the uncensored data, using censored values as place holders, 
and estimates the model parameters from the regression line, including mean and standard 
deviation (SD). We used the ROS-generated mean and SD to simulate values below the DL 
according to a truncated normal distribution. Cox regression was then used to estimate the 
hazard with the complete data set that combined the simulated values with the observed 
data. The simulation of the truncated normal distribution and the following Cox proportional 
hazards model were repeated C times to adjust for the uncertainty due to the simulation [36, 
37]. The point estimate was the average of the C imputed data sets, and the standard error 
(SE) was calculated using Rubin's rule [38]. In the simulation of the truncated data set, we 
used C = 10. Note that the ROS algorithm adopted here is a modified version which properly 
accounts for the simulation variability. The original ROS in [12] used C = 1, as the goal was 
to estimate summary statistics. 

All computations herein were implemented using R 2.15.2, and Bayesian computation was 
conducted using R package R2jags. As both frequentist and Bayesian methods were 
considered in this paper, we evaluated the methods using bias, empirical SE, average SE, 
root mean square error (RMSE), and coverage probability of 95% confidence interval (CI) 
or Highest Posterior Density (HPD) (95% CP), as well as power for true |3 and type I error 
rate for null |3. 

Simulation I 

This simulation evaluated the model performance model when the distribution of the 
biomarker measurements was correctly specified. Especially, we simulated the 
logarithmically transformed biomarker values from N (3.7, 1.25). All the parameter 
estimates were computed using a multivariable Cox proportional hazards model in which no 
correlation was assumed for the covariates. As in Table 1, the parameter estimates did not 
show much difference across the 5 methods at 10% LDL, but the difference became 
substantial at higher LDL fractions. The method of 1/2 DL had the largest bias and the 
lowest 95% CI coverage. MI and ROS also had substantial biases, large RMSEs, and poor 
coverage probabilities (Table 1). Although the deletion method had negligible bias, it had a 
larger RMSE and poorer power compared to the Bayesian method (Table 1 and Figure 1). 
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Overall, the Bayesian method outperformed the other methods in terms of RMSE, CP, and 
power. 

Simulation II 

We investigated the robustness of the proposed Bayesian method with respect to violation of 
the log-normal assumption of the biomarker measurements. We simulated the data in the 
same manner as in Simulation I except that the logarithmically transformed biomarker 
values were obtained from a heavier tailed t25 distribution with mean 3.7. The study showed 
similar results as Simulation I. The Bayesian method still outperformed other methods for 
this mild misspecification of the distribution, although its coverage probability started to fall 
below the nominal rate with increased DL fraction (Table 2 and Figure 1). 

Simulation III 

We further evaluated the proposed Bayesian method for robustness with severe 
misspecification of the underlying normal distribution using a highly skewed gamma 
distribution. We simulated the data as above, except that the logarithmically transformed 
biomarker was specified as a Gamma(0.5,l) distribution. For this serious misspecification, 
the proposed Bayesian method had increased bias and lower CP. As shown in Figure 2, the 
normality assumption is clearly violated judged from the QQ-plot against a normal 
distribution for logarithmic transformation of the DL biomarker variable in a randomly 
selected simulated data set. Interestingly, single replacement with the one-half DL method 
had very small bias, and the bias did not increase with higher DL fractions. This is because 
Gamma(0.5, 1) is highly positively skewed and has 10%, 30% and 50% threshold values of 
0.008, 0.074, and 0.227 with coiTesponding half DL values of 0.004, 0.037, and 0.114, 
respectively. ROS, another distribution-based method, performed the worst in this setting 
(Table 3). Although the deletion method showed merit in this scenario, its estimates had 
high efficiency loss. Furthermore, as shown in Figure 3 of Section 5, the deletion method 
did not provide valid summary statistics, such as mean or median, which is directly related 
to the distribution of the DL biomarker. 

ANALYSIS OF THE ACUTE LUNG INJURY STUDY 

The proposed method was applied to analyze data from the acute lung injury (ALI) study 
introduced in the Section of motivating study. In this analysis, we are interested in the 
association between biomarker IL8 and time to VR, controlling for age and alveolar-arterial 
O2 difference (A. a). We used the proposed Bayesian method to study the association 
between IL8 and VR and compared the Bayesian result with the other 4 methods. A log- 
normal distribution seemed reasonable based on the QQ-plot of the observed data, although 
38% of the IL8 data in this study are subject to the DL (Figure 4 left panel). Trace plots 
were used to evaluate the convergence of the parameters (not shown due to limited number 
of tables and figures). Based on the Cox proportional hazards model fit with the imputed 
data, a moderate negative association was found in the Bayesian method for IL8 with the 
"hazard" of removing ventilation (HR = 0.83; 95% HPD: 0.78-0.89), suggesting the 
removal of ventilation was less likely for patients with higher IL8 (Single Biomarker 
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Analysis in Table 4). The other methods yielded HR estimates ranging from 0.72 to 0.85 
with larger standard error estimates compared to the Bayesian approach. 

We also used the ALI study to demonstrate the methods in the scenario with more than one 
biomarker subject to DL. To this end, we truncated ICAM.l to generate a dataset with 30% 
observations below DL. We also performed a sensitivity analysis to evaluate the impact of 
sequential conditioning of the two biomarkers subject to DL. In particular, we considered 
the Cox proportional hazards model 

X[t\]og{IL8)Aog{ICAMJ),age,A.a]=Xo{t)exp[(3o+/3ilog{IL8)+f32\og{ICAM.l)+(3sage+f3^A.a] 
and assume 

f(log{IL8),log{ICAM.l)\age,A.a)=f{\og{IL8) \age, A.a,\og{ICAM.l))f{logiICAM.l)\age, A.a), 
or 

f(logiIL8)Aog{ICAM.l)\age,A.a)=f{log{ICAM.l) \age,A.a,\og(IL8))fi\og{IL8)\age,A.a) ( 

for the joint distribution of IL8 and ICAM.l. Figure 3 shows the histograms of log(ICAM.l) 
that combine the observed and imputed values of each method, as well as the full original 
data. Only the ROS and Bayesian methods yielded distributions that are close to the original 
data. Recall that the ROS method did not provide valid estimates and inferences for the 
regression coefficient of the DL biomarker in the previous simulation studies. This 
observation is not surprising as the ROS method was originally proposed to estimate 
summary statistics, such as the mean of the biomarker, but not to estimate the regression 
coefficient of the DL biomarker. The log-normal assumption for ICAM. 1 was evaluated by 
QQ-plot of the observed data (Figure 4, right panel). Table 4 shows that the Bayesian HR 
estimates are robust to the order of the sequential conditioning (HR = 0.84 with 95% HPD 
0.78-0.90 in Bayesianl versus HR = 0.84 with 95% HPD 0.79-0.90 in Bayesian2, Multiple 
Biomarker Analysis in Table 4). ROS estimates were left in blank in Table 4 because the 
NADA package in R only handles single biomarker subject to DL. 

DISCUSSION 

This article proposes a general Bayesian approach for the Cox proportional hazards model 
with explanatory measurement variables subject to DL. We focused on the Cox proportional 
hazards model as it is the most widely-used model for survival analysis. The validity and 
application of the proposed approach do not rely on the proportional hazards assumption of 
the Cox model, thus, generalizing the method to other time-to-event models and 
incorporating a variety of techniques in Bayesian inference and diagnostics are 
straightforward [29]. With the counting process notation, we can extend our method to the 
Cox model with time-dependent covariates and random effect (frailty) models for multiple 
event time data, among others. The JAGS code in the Appendix can be easily modified to 
incorporate the extension. 
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The proposed Bayesian method performed well when biomarker measurement distribution 
was correctly specified or mildly misspecified as shown in the motivating example and 
Simulations I and II. However, the proposed Bayesian method was not robust to severe 
misspecification of the underlying distribution as shown in Simulation III. Our study 
demonstrates the importance of an appropriate specification of DL variable distribution in 
improving the model performance. The QQ-plot approach or model selection criteria such as 
deviance information criterion can be used to guide the distribution specification in this 
setting. When the normality assumption is violated, other parametric models can be 
specified for the biomarkers subject to the DL and implemented easily in JAGS or 
OpenBUGS by modifying the example programs given in the Appendix. If the parametric 
distribution assumption is reasonable, the proposed Bayesian approach can yield valid and 
efficient inference with joint posterior modehng for covariates with nondetects and an 
outcome variable. Furthermore, in order to cope with the challenges of the common practice 
of the multiple biomarker approach in disease outcome prediction [19], we extended the 
proposed Bayesian method to the case of multiple biomarkers subject to the DL through a 
sequence of conditional distributions. In this situation, a sensitivity analysis needs to be 
considered to access the effect of the order of conditioning on the biomarkers. 
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APPENDIX 

JAGS code is provided below. Centered values are used in continuous variables. 

data 
( 

for(i in 1:N) ( 
for(j in 1:T) ( 

Y[i,j] <- step (obs . t [i] -t [ j ] +eps) 

dN[i,j] <- Y [i, j] *step (t [ j+1] -obs .t [i] -eps) *fail [i] 

} 

} 

} 

model 
{ 

for(j in 1:T) ( 
for(i in 1:N) ( 
dN[i,j]~ dpois(Idt[i, j]) 

Idt[i,j] <- Y [i, j] *dLO [ j] *exp (beta* (Z [i] - 
Z . c) +beta2* (age [i] -age . c) 
+beta3* (il8 [i] -il8 . c) ) 
} 
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dLO[j] ~ dgamma (mu [ j ] , c) 
mu[j] <- dLO . star [ j ] *c 
} 

for(i in 1 ;N) { 

cens.il8[i] ~ dinterval (logLDL, il8 [i] ) 
il8[i] - dnorm (mu . il8 [i] , tau) 
mu.il8[i] <- alphaO+alpha2* (Z [i] - 
Z . c) +alpha3* (age [i] -age . c) 
} 

c <- 0 .001 
r <- 0.1 

for (j in 1 :T) { 

dLO.star[j] <- r* (t [ j+1 ] -t [ j ] ) 

1 

} 
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Figure 1. 

Power Analysis for Simulation Studies. 
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Histogram Plot and QQ-plot for a Randomly Selected Dataset from Simulation III. 
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Histogram Plot of log(ICAM. 1) in ALI Study Completed with Imputed Values. 
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QQ-plots for Logarithmic Transformation of Observed log(IL8) and log(ICAM-l) in ALI 
Study. 
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